ABSTRACT

We develop an improved mass tracer for clusters of galaxies from optically ob- 
served parameters, and calibrate the mass relation using weak gravitational lensing 
measurements. We employ a sample of ~ 13 000 optically-selected clusters from the 
Sloan Digital Sky Survey (SDSS) maxBCG catalog, with photometric redshifts in 
the range 0.1-0.3. The optical tracers we consider are cluster richness, cluster lumi- 
nosity, luminosity of the brightest cluster galaxy (BCG), and combinations of these 
parameters. We measure the weak lensing signal around stacked clusters as a func- 
tion of the various tracers, and use it to determine the tracer with the least amount 
of scatter. We further use the weak lensing data to calibrate the mass
normalization. We find that the best mass estimator for massive clusters
is a combination of cluster richness, $N_{200}$, and the luminosity of the
brightest cluster galaxy, $L_{\rm BCG}$:
$M_{200\bar\rho} = (1.27 \pm 0.08)\,(N_{200}/20)^{1.20 \pm 0.09}\,(L_{\rm BCG}/\bar{L}_{\rm BCG}(N_{200}))^{0.71 \pm 0.14} \times 10^{14}\,h^{-1}\,M_\odot$,
where $\bar{L}_{\rm BCG}(N_{200})$ is the observed mean BCG luminosity at a given richness. This
improved mass tracer will enable the use of galaxy clusters as a more powerful tool 
for constraining cosmological parameters. 

Key words: clusters - weak lensing; galaxy clusters. 



1 INTRODUCTION 

Clusters of galaxies trace the matter density distribution 
in the Universe, and they have long been used success- 
fully as powerful cosmological probes. Relating the observed 
cluster abundance to the dark matter halo abundance pre- 
dicted by cosmological simulations provides powerful con- 
straints on a range of cosmological parameters, including the 
amplitude of matter fluctuations, neutrino mass and dark
energy density (Bahcall & Cen 1992; Haiman et al. 2001;
Weller & Battye 2003; Wang et al. 2005; Albrecht et al.
2006; Mandelbaum & Seljak 2007). The strength of these
constraints arises from the exponential cutoff in the cluster 
mass function for the most massive clusters, which depends 
strongly on both the amplitude of matter fluctuations and 
the matter density. 

Currently, the use of clusters as precise cosmological 
probes is limited by the lack of reliable mass estimates for 
a large sample of clusters. While hydrodynamic simulations
can provide estimates for the relation between X-ray observable
parameters and cluster mass (e.g., Kravtsov et al. 2006;
Nagai et al. 2007), it is not clear that the simulations include
all the relevant physics determining these relations.
Estimating the virial mass of individual clusters
using X-ray measurements (e.g., Schmidt & Allen 2007) requires
the assumption of hydrostatic equilibrium, which introduces
potential systematics for non-relaxed clusters, and
neglects the effects of non-thermal pressure support, such
as that from turbulence, cosmic rays, and magnetic fields.
There is a hint of a ~ 20 per cent conflict between theoretical
predictions and observations for the normalizations
of these mass relations (Arnaud et al. 2007; Nagai et al.
2007). This discrepancy between hydrostatic masses and total
mass also appears to be supported by observational results
(Mahdavi et al. 2008). Thus, a careful treatment is necessary
before X-ray mass estimates can be used for precision cosmology.

A way to estimate cluster masses that is insensitive to 
the dynamical state of the system is through weak gravita- 
tional lensing measurements. These directly probe the total 
(dark plus luminous) matter distribution. Estimates of the
mass of individual clusters using weak lensing are currently
limited to ~ 30 per cent uncertainties by the signal-to-noise
ratio of the lensing measurements, for clusters with $M_{500} \sim$
few $\times 10^{14}\,h^{-1}\,M_\odot$ (e.g., Hoekstra 2007; Pedersen & Dahle
2007). They are also subject to systematics such as the shear
and source redshift calibration, and limitations due to projection
effects of matter near the cluster or along the line of sight
(Metzler et al. 2001; Hoekstra 2003). These probes
can be augmented by strong gravitational lensing measurements
(Bradač et al. 2005; Cacciato et al. 2006) and velocity
dispersion measurements (Becker et al. 2007) to aid in
the cluster mass determination (e.g., using the methods of
Mahdavi et al. (2007) and Sereno (2007)).

Here, we calibrate the mass relations for a range of op- 
tical parameters using measurements of the stacked weak 
lensing signal around a large set of clusters. This approach 
is complementary to those methods that provide mass es- 
timates for individual clusters, which cannot currently be 
fully applied to large datasets. For example, velocity disper- 
sion measurements are limited by the practical difficulty of 
obtaining spectroscopic observations for a large number of 
clusters. Our method for mass calibration can be readily applied
to datasets from upcoming large-scale surveys, such as
DES, Pan-STARRS, and LSST.

We employ the largest available sample of ~13 000
galaxy clusters (maxBCG cluster catalog; Koester et al.
2007a,b) selected from the Sloan Digital Sky Survey (SDSS;
York et al. 2000). Stacking the weak lensing signals around
many clusters increases the signal-to-noise ratio that we can
achieve. The availability of accurate photometric redshifts
for all objects in the sample also improves our mass measurements.
Independent weak lensing analyses of clusters
in this catalog have been performed (Sheldon et al. 2007a;
Johnston et al. 2007; Sheldon et al. 2007b). Closest to this
work is Johnston et al. (2007), where scaling relations of
cluster mass with optical richness and cluster luminosity 
were obtained using a different method for estimating the 
cluster mass. 

In this work, we consider optical tracers available in 
large cluster surveys, such as cluster richness, cluster lumi- 
nosity, and luminosity of the brightest cluster galaxy (BCG) , 
and assess how well these parameters trace the cluster mass. 
In addition, we consider combinations of these parameters 
and assess whether they provide better mass determinations. 
Finding the most faithful tracer of cluster mass among the 
available options will allow us to fully harness the power of 
clusters in constraining cosmological parameters. 

The paper is organized as follows. In Sec. 2 we describe
the cluster catalog and the weak lensing measurements. In
Sec. 3 we describe how we use stacked weak lensing measurements
to estimate cluster masses, and discuss our approach
for assessing mass tracers in Sec. 3.5. Sec. 4 deals
with various tests of systematics. We present our results in
Sec. 5 and conclude in Sec. 6.
Figure 1. Correlation of $N_{200}$ (cluster richness in red galaxies)
with $L_{200}$ (cluster luminosity in red galaxies), $L_{\rm BCG}$ (BCG luminosity),
and the combination $N_{200} L_{\rm BCG}^{0.75}$ (with luminosities
in units of $10^{10}\,h^{-2}\,L_\odot$). The cluster sample, which is selected by
richness ($N_{200} \geq 10$), is complete above $L_{200,10} = 30$ and above
$N_{200} L_{{\rm BCG},10}^{0.75} = 80$ (solid horizontal lines), but is not complete in
$L_{\rm BCG}$ at any value.



2 DATA 

In this section, we describe the SDSS data (Sec. 2.1), the lens
cluster sample from the maxBCG cluster catalog (Sec. 2.2),
and the source galaxy catalog used in the weak lensing analysis
(Sec. 2.3).



2.1 SDSS Data 

The maxBCG cluster catalog and the lensing source catalog
come from the SDSS, a survey to image roughly $\pi$ steradians
of the sky, and follow up approximately one million
of the detected objects spectroscopically (Eisenstein et al.
2001; Richards et al. 2002; Strauss et al. 2002). The imaging
is carried out by drift-scanning the sky in photometric
conditions (Hogg et al. 2001; Ivezić et al. 2004), in
five bands (ugriz) (Fukugita et al. 1996; Smith et al. 2002)
using a specially-designed wide-field camera (Gunn et al.
1998). These imaging data are used to create the source
catalog that we use in this paper. In addition, objects
are targeted for spectroscopy using these data
(Blanton et al. 2003a) and are observed with a double
320-fiber spectrograph on the same telescope (Gunn et al.
2006). All of these data are processed by automated
pipelines that detect and measure photometric properties
of sources, and astrometrically calibrate the data
(Lupton et al. 2001; Pier et al. 2003; Tucker et al. 2006).
The SDSS is nearly complete, and has had seven major data
releases (Stoughton et al. 2002; Abazajian et al. 2003, 2004,
2005; Finkbeiner et al. 2004; Adelman-McCarthy et al.
2006, 2007; Adelman-McCarthy et al. 2008).
2.2 Cluster Lens sample

Our lens sample consists of 12 612 clusters from the public
maxBCG catalog, with richness in red galaxies of $N_{200} \geq 10$
(where the galaxy count includes galaxies brighter than
$0.4L_*$ and located within a scaled radius of $r_{200}$, defined in
Eq. 1 below). The clusters have photometric redshifts in the
range of z = 0.1-0.3, selected over a 0.5 $(h^{-1}\,{\rm Gpc})^3$ volume
covering 7500 deg$^2$ of sky. Our sample excludes ~ 9 per cent
of the solid angle covered by the survey where lensing shape
measurements of source galaxies are currently not available.
The maxBCG catalog is presented and discussed in detail by
Koester et al. (2007a,b). In this section, we briefly describe
the cluster finder algorithm, and define the cluster properties
used in this work.

The maxBCG cluster finder exploits the existence of the 
E/S0 red ridgeline of cluster galaxies in the color-magnitude 
diagram, and of a brightest cluster galaxy (BCG) found near 
the centre of most clusters. For each galaxy, it obtains a 
photometric redshift estimate by maximizing the likelihood 
that (i) it is located in an overdensity of E/S0 ridgeline 
galaxies of similar colors, and (ii) it has colors and magni- 
tudes of a typical BCG at that redshift. It also determines 
$N_{1{\rm Mpc}}$, the number of E/S0 ridgeline galaxies located within
a projected distance of $1\,h^{-1}$ Mpc of the galaxy, which are
dimmer than the galaxy and brighter than $0.4L_*$, where
$L_* = 2.08 \times 10^{10}\,h^{-2}\,L_\odot$ in the i band at z = 0.1, with
a dependence on redshift determined from a PEGASE-2 stellar
population/galaxy formation model, similar to that of
Eisenstein et al. (2001). It then chooses the galaxy with the
highest likelihood and $N_{1{\rm Mpc}}$ as a bona fide BCG.

To identify cluster members, the cluster size is estimated
to be $r_{200}$, the radius within which the galaxy number
density of the cluster is $200\,\Omega_m^{-1}$ times the mean density
of galaxies in the present Universe. The scaled radius $r_{200}$
is estimated from the empirical relation from Hansen et al.
(2005):

$r_{200} = 0.156\,N_{1{\rm Mpc}}^{0.6}\,h^{-1}\,{\rm Mpc}$.   (1)

The cluster finder identifies galaxies within a scaled radius
$r_{200}$ of the BCG, removes them from the list of potential
cluster centres, and continues down the list of galaxies with
lower likelihood and lower $N_{1{\rm Mpc}}$ until all candidates are
exhausted. For more details, see Koester et al. (2007a,b).

Koester et al. (2007a,b) performed tests of purity and
completeness of the maxBCG catalog using mock catalogs
from N-body simulations. They found that the sample is
more than 90 per cent pure for clusters with $N_{200} \geq 10$,
and 90-95 per cent pure for clusters with $N_{200} \geq 20$.
The sample is > 90 per cent complete for masses $M_{200} >
2 \times 10^{14}\,h^{-1}\,M_\odot$, and > 95 per cent complete for masses
$M_{200} > 3 \times 10^{14}\,h^{-1}\,M_\odot$, where $M_{200}$ is the mass within
$r_{200}$. These results are of course subject to the assumption
that the mock catalogs are a faithful representation of the 
clusters. 

In this work, we use three optical properties of clusters 
that are reported in the maxBCG catalog: 

• $N_{200}$ (cluster richness): the number of E/S0 ridgeline
member galaxies fainter than the BCG, brighter than
$0.4L_*$, and located within a projected distance $r_{200}$ (given
by Eq. 1) from the BCG.

• $L_{200}$ (cluster luminosity): the summed r band luminosities
of the BCG and the ridgeline member galaxies
included in $N_{200}$, k-corrected to z = 0.25. We usually
express this luminosity in units of $10^{10}\,h^{-2}\,L_\odot$ and denote it
by $L_{200,10}$.

• $L_{\rm BCG}$ (BCG luminosity): the r band luminosity of the
BCG, k-corrected to z = 0.25. We usually express this
luminosity in units of $10^{10}\,h^{-2}\,L_\odot$ and denote it by $L_{{\rm BCG},10}$.



These luminosities are based on SDSS 'cmodel' magnitudes, 
which are constructed from a weighted combination of de 
Vaucouleurs and exponential magnitudes. The weights are 
determined by fitting the galaxy surface brightness profile 
with a linear combination of the best-fitting de Vaucouleurs 
and exponential profiles. The k-corrections are calculated from
the LRG template in v4.1.4 of KCORRECT (Blanton et al.
2003b), using photometric redshifts and without applying
a correction for evolution. Galactic extinction correction is
applied using the extinction maps of Schlegel et al. (1998).
We note that these luminosities may be underestimated (at
the 10 per cent level) due to systematic errors in sky subtraction,
which is most severe in galaxies of large extent
(Adelman-McCarthy et al. 2008).

Figure 1 shows the correlation of the cluster richness
in red galaxies $N_{200}$ with other optical parameters for the
richness-selected cluster sample ($N_{200} \geq 10$). There is a
strong correlation between $N_{200}$ and $L_{200}$ (with a rank
correlation coefficient of 0.68). The sample is complete for
cluster luminosities $L_{200,10} \geq 30$ (uppermost panel). On the
other hand, while the minimum value of $L_{\rm BCG}$ correlates
with $N_{200}$, the maximum value of $L_{\rm BCG}$ does not. The two
parameters are weakly correlated, with a rank correlation
coefficient of 0.30. The scatter in $L_{\rm BCG}$ at fixed richness has a
Gaussian distribution with width > 0.17 dex (Hansen et al.
2007). The sample is not complete in $L_{\rm BCG}$ even at the
brightest end (middle panel). However, the sample is complete
at $N_{200} L_{{\rm BCG},10}^{0.75} \geq 80$ (lowermost panel). The $1\sigma$
statistical error in the luminosities is roughly 0.06 dex (dominated
by photometric redshift error), and is much smaller
than the observed scatter.



2.3 Source catalog 

The source galaxy sample used for the weak lensing measurements
is the same as that originally described in
Mandelbaum et al. (2005a), hereafter M05. This source
sample includes over 30 million galaxies from the SDSS
imaging data with r-band model magnitude brighter than
21.8, with shape measurements obtained using the REGLENS
pipeline, including PSF correction done via
re-Gaussianization (Hirata & Seljak 2003) and with cuts designed
to avoid various shear calibration biases. A full description
of this pipeline can be found in M05.

The REGLENS pipeline obtains galaxy images in the r
and i filters from the SDSS "atlas images" (Stoughton et al.
2002). The basic principle of shear measurement using these
images is to fit a Gaussian profile with elliptical isophotes
to the image, and define the components of the ellipticity

$(e_+, e_\times) = \frac{1 - (b/a)^2}{1 + (b/a)^2}\,(\cos 2\phi, \sin 2\phi)$,   (2)

where b/a is the axis ratio and $\phi$ is the position angle of
the major axis. The ellipticity is then an estimator for the
shear,

$(\gamma_+, \gamma_\times) = \frac{1}{2\mathcal{R}}\,\langle (e_+, e_\times) \rangle$,   (3)



where $\mathcal{R} \approx 0.87$ is called the "shear responsivity" and represents
the response of the ellipticity (Eq. 2) to a small shear
(Kaiser et al. 1995; Bernstein & Jarvis 2002). In practice,
a number of corrections need to be applied to obtain the
ellipticity. The most important of these is the correction
for the smearing and circularization of the galactic images
by the PSF; M05 uses the PSF maps obtained from stellar
images by the psp pipeline (Lupton et al. 2001), and
corrects for these using the re-Gaussianization technique of
Hirata & Seljak (2003), which includes corrections for
non-Gaussianity of both the galaxy profile and the PSF. In order
for these corrections to be successful, we require that the
galaxy be well-resolved compared to the PSF in both r and
i bands (the only ones used for shape measurement). To do
this we define the Gaussian resolution factor:

$R_2 = 1 - \frac{T^{(P)}}{T^{(I)}}$,   (4)

where the T values are the traces of the adaptive covariance
matrices, and the superscripts indicate whether they are of
the PSF or of the galaxy image. A large galaxy (compared
to the PSF) would have $R_2 \sim 1$, while a star or other unresolved
source would have $R_2 \approx 0$. We require that $R_2$ exceed
1/3 in both r and i bands.
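
As an illustration of Eqs. 2-4, the following Python sketch computes the ellipticity components from an axis ratio and position angle, converts a mean ellipticity into a shear estimate, and applies the resolution cut. It is a minimal sketch and not the REGLENS implementation: the function names are hypothetical, the responsivity is fixed at the quoted value of 0.87, and the traces T are assumed to come from an adaptive-moments measurement that is not shown here.

```python
import numpy as np


def ellipticity(axis_ratio, phi):
    """Ellipticity components (e_+, e_x) from axis ratio b/a and position angle phi (radians)."""
    q2 = np.asarray(axis_ratio, dtype=float) ** 2
    e = (1.0 - q2) / (1.0 + q2)
    return e * np.cos(2.0 * phi), e * np.sin(2.0 * phi)


def shear_estimate(e_plus, e_cross, responsivity=0.87):
    """Mean ellipticity divided by twice the shear responsivity (Eq. 3)."""
    norm = 2.0 * responsivity
    return np.mean(e_plus) / norm, np.mean(e_cross) / norm


def resolution_factor(trace_psf, trace_image):
    """Gaussian resolution factor R_2 = 1 - T^(P) / T^(I) (Eq. 4)."""
    return 1.0 - np.asarray(trace_psf, dtype=float) / np.asarray(trace_image, dtype=float)


def is_resolved(r2_r_band, r2_i_band, threshold=1.0 / 3.0):
    """True where R_2 exceeds the threshold in both the r and i bands."""
    return (np.asarray(r2_r_band) > threshold) & (np.asarray(r2_i_band) > threshold)
```

A round image (b/a = 1) gives zero ellipticity for any position angle, and an image whose trace equals that of the PSF gives $R_2 = 0$ and is rejected by the cut.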



3 CLUSTER MASSES FROM STACKED 
WEAK LENSING MEASUREMENTS 

In this section, we describe how we estimate cluster masses 
using stacked weak lensing measurements. We discuss theory
(Sec. 3.1), computation of the lensing signal (Sec. 3.2),
modeling of the density profiles (Sec. 3.3), fits to the observed
lensing signal to obtain cluster masses (Sec. 3.4), and
interpretation of the best-fitting masses (Sec. 3.5).



3.1 Theory 

Cluster-galaxy lensing provides a simple way to probe the 
connection between galaxies and matter via their cross- 
correlation function 



$\xi_{gm}(\vec{r}) = \langle \delta_g(\vec{x})\,\delta_m(\vec{x} + \vec{r}) \rangle$,   (5)

where $\delta_g$ and $\delta_m$ are overdensities of galaxies and matter,
respectively. This cross-correlation can be related to the projected
surface density

$\Sigma(R) = \bar\rho \int \left[1 + \xi_{gm}\left(\sqrt{R^2 + \chi^2}\right)\right] d\chi$   (6)

(where $r^2 = R^2 + \chi^2$), which is then related to the observable
quantity for lensing,

$\Delta\Sigma(R) = \gamma_t(R)\,\Sigma_c = \bar\Sigma(<R) - \Sigma(R)$,   (7)

where $\gamma_t$ is the tangential shear. The second relation is true
only in the weak lensing limit, for a matter distribution
that is axisymmetric along the line of sight. This symmetry
is naturally achieved by our procedure of stacking many
clusters and determining their average lensing signal. This
observable quantity can be expressed as the product of the
tangential shear $\gamma_t$ and a geometric factor

$\Sigma_c = \frac{c^2}{4\pi G}\,\frac{D_S}{D_L D_{LS} (1 + z_L)^2}$,   (8)

where $D_L$ and $D_S$ are angular diameter distances to the lens
and source, $D_{LS}$ is the angular diameter distance between
the lens and source, and the factor of $(1 + z_L)^{-2}$ arises due to
our use of comoving coordinates. For a given lens redshift,
$\Sigma_c^{-1}$ rises from zero at $z_S = z_L$ to an asymptotic value at
$z_S \gg z_L$; that asymptotic value is an increasing function of
lens redshift.

In practice, we truncate the integral in Eq. 6 at the virial
radius of the cluster (defined in Eq. 12 below), motivated by
attempts to model the lensing signal in simulations (M05).
Truncation at two times the virial radius would change the
cluster mass estimates at the 5 per cent level.
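
The redshift dependence of the geometric factor in Eq. 8 can be checked numerically. The sketch below is illustrative only: it assumes a flat LCDM cosmology with $\Omega_m = 0.3$ (a value chosen for the example, not taken from this section), evaluates distances with a simple trapezoidal integral, and uses hypothetical function names. Distances are in $h^{-1}$ Mpc, so $\Sigma_c^{-1}$ is returned in comoving ${\rm Mpc}^2/(h\,M_\odot)$.

```python
import numpy as np

C_OVER_H0 = 2997.92458    # Hubble distance c/H0 in h^-1 Mpc
C2_OVER_4PIG = 1.663e18   # c^2 / (4 pi G) in M_sun per Mpc


def comoving_distance(z, omega_m=0.3, n_steps=2048):
    """Line-of-sight comoving distance in h^-1 Mpc for flat LCDM (assumed cosmology)."""
    zz = np.linspace(0.0, z, n_steps)
    inv_e = 1.0 / np.sqrt(omega_m * (1.0 + zz) ** 3 + 1.0 - omega_m)
    # Trapezoidal rule written out explicitly.
    return C_OVER_H0 * np.sum(0.5 * (inv_e[1:] + inv_e[:-1]) * np.diff(zz))


def inverse_sigma_crit(z_lens, z_source, omega_m=0.3):
    """Sigma_c^-1 in comoving units; zero for sources at or in front of the lens."""
    if z_source <= z_lens:
        return 0.0
    chi_l = comoving_distance(z_lens, omega_m)
    chi_s = comoving_distance(z_source, omega_m)
    d_l = chi_l / (1.0 + z_lens)                # angular diameter distance to the lens
    d_s = chi_s / (1.0 + z_source)              # angular diameter distance to the source
    d_ls = (chi_s - chi_l) / (1.0 + z_source)   # lens-source distance, flat universe only
    return d_l * d_ls * (1.0 + z_lens) ** 2 / (C2_OVER_4PIG * d_s)
```

Evaluating `inverse_sigma_crit` at fixed lens redshift and increasing source redshift reproduces the behaviour described above: the value starts at zero for $z_S = z_L$ and flattens towards an asymptote for distant sources.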



3.2 Signal computation 

To compute the average lensing signal $\Delta\Sigma(R)$, lens-source
pairs are first assigned weights according to the error on the
shape measurement via

$w_{ls} = \frac{\Sigma_c^{-2}}{\sigma_s^2 + \sigma_{SN}^2}$,   (9)

where $\sigma_s$ is the shape measurement error and $\sigma_{SN}^2$, the
intrinsic shape noise, was determined as a function of magnitude
in M05, Figure 3. The factor of $\Sigma_c^{-2}$
downweights pairs that are close in redshift, converting the
shape noise in the denominator to a noise in $\Delta\Sigma$.

Once we have computed these weights, we compute the
lensing signal in 62 logarithmic radial bins from 0.02 to 9
$h^{-1}$ Mpc as a summation over lens-source pairs via:

$\Delta\Sigma(R) = \frac{\sum_{ls} w_{ls}\,e_t^{(ls)}\,\Sigma_c}{2\mathcal{R} \sum_{ls} w_{ls}}$,   (10)

where the factor of 2 arises due to our definition of ellipticity.
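
The weighting and summation of Eqs. 9 and 10 can be sketched for a single radial bin as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the inputs are per-pair arrays that have already been assigned to the bin, and the responsivity is taken as a single number (0.87) rather than computed from the source sample.

```python
import numpy as np


def pair_weights(sigma_c, sigma_shape, sigma_sn):
    """Per-pair weight: Sigma_c^-2 divided by the total shape variance (Eq. 9)."""
    sigma_c = np.asarray(sigma_c, dtype=float)
    variance = np.asarray(sigma_shape, dtype=float) ** 2 + np.asarray(sigma_sn, dtype=float) ** 2
    return sigma_c ** -2.0 / variance


def stacked_delta_sigma(e_t, sigma_c, sigma_shape, sigma_sn, responsivity=0.87):
    """Weighted mean Delta Sigma in one radial bin from lens-source pairs (Eq. 10)."""
    e_t = np.asarray(e_t, dtype=float)
    sigma_c = np.asarray(sigma_c, dtype=float)
    w = pair_weights(sigma_c, sigma_shape, sigma_sn)
    return np.sum(w * e_t * sigma_c) / (2.0 * responsivity * np.sum(w))
```

If every pair carries the same tangential shear $\gamma_t$, so that $e_t = 2\mathcal{R}\gamma_t$, and the same $\Sigma_c$, the estimator returns $\gamma_t \Sigma_c$ regardless of the weights, which is a convenient sanity check.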

There are several additional procedures that must be 
done when computing the signal (for more detail, see M05) . 
First, the signal computed around random points must be 
subtracted from the signal around real lenses to eliminate 
contributions from systematic shear. The measured signal 
around random points is consistent with zero over the range 
of radii we use. Subtraction of this signal introduces noise 
with RMS of ~ 15 per cent on scales from 0.5 to 1 $h^{-1}$ Mpc,
and ~ 1 per cent from 1 to 9 $h^{-1}$ Mpc.

Second, the signal must be boosted, i.e., multiplied by
$B(R) = n(R)/n_{\rm rand}(R)$, the ratio of the number density
of sources relative to the number density around random
points, in order to account for the dilution of the lensing
signal due to sources that are physically associated with a
lens (i.e., cluster galaxy members), and therefore not lensed.
We find that B(R) decreases with increasing distance from
the center, ranging from ~ 1.2 to 1.4 at $R = 0.5\,h^{-1}$ Mpc
(for low- to high-mass clusters), and dropping to unity for
$R > 4\,h^{-1}$ Mpc.
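
The two corrections described above can be written compactly. The sketch below is an assumption-laden illustration, not the M05 procedure: the function names are hypothetical, the inputs are per-bin arrays, and the corrections are applied in the order in which the text lists them (random-point subtraction first, boost second). Because the random-point signal is consistent with zero, the ordering has little practical effect.

```python
import numpy as np


def boost_factor(n_lens, n_random):
    """B(R): source number density around lenses relative to random points."""
    return np.asarray(n_lens, dtype=float) / np.asarray(n_random, dtype=float)


def corrected_signal(ds_lens, ds_random, n_lens, n_random):
    """Subtract the random-point signal, then multiply by the boost factor.

    The order of the two steps is an assumption made for this sketch.
    """
    boost = boost_factor(n_lens, n_random)
    return boost * (np.asarray(ds_lens, dtype=float) - np.asarray(ds_random, dtype=float))
```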

Figure 2. Observed mean lensing signals around stacked clusters
in three richness bins (data points; from bottom to top): $N_{200}$ =
10-11, 26-40, and 71-190, with best-fitting masses $M_{200\bar\rho}$ =
0.65±0.30, 2.48±0.57, and 8.72±1.40 $\times 10^{14}\,h^{-1}\,M_\odot$, respectively.
Also shown are the best-fitting one-halo and halo-halo profiles
(dotted and long-dashed curves, respectively), the estimated stellar
component (short-dashed curves) and the sum of these three
(solid curves). The range of scales used for the fits is R = 0.5-4.0
$h^{-1}$ Mpc (rightward of the vertical dashed line). For this range
of scales, the stellar contribution is negligible and the halo-halo
contribution is sub-dominant to the one-halo term. However, the
halo-halo contribution becomes significant for $R > 1\,h^{-1}$ Mpc. We
model the lensing signal as a sum of the one-halo and halo-halo
profiles.

To determine errors on the lensing signal, we divide
the survey area into 200 bootstrap subregions, and generate
2500 bootstrap-resampled datasets. Furthermore, to decrease
noise in the covariance matrices due to the bootstrap,
we rebin the signal into 22 radial bins (of which 7 are in the
range of radii we use for our fits).
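
A bootstrap over survey subregions of the kind described above can be sketched as follows. The code is illustrative and makes several assumptions: the stacked signal and its summed weight have already been computed separately in each subregion and radial bin, subregions are resampled with replacement, and the function name and array layout are hypothetical.

```python
import numpy as np


def bootstrap_covariance(region_signal, region_weight, n_boot=2500, seed=0):
    """Bootstrap-resampled mean signals and their covariance between radial bins.

    region_signal, region_weight: arrays of shape (n_regions, n_bins).
    """
    signal = np.asarray(region_signal, dtype=float)
    weight = np.asarray(region_weight, dtype=float)
    rng = np.random.default_rng(seed)
    n_regions, n_bins = signal.shape
    means = np.empty((n_boot, n_bins))
    for b in range(n_boot):
        # Draw subregions with replacement and recompute the weighted mean.
        pick = rng.integers(0, n_regions, size=n_regions)
        means[b] = np.sum(weight[pick] * signal[pick], axis=0) / np.sum(weight[pick], axis=0)
    return means, np.cov(means, rowvar=False)
```

The off-diagonal elements of the returned matrix carry the correlations between radial bins, and the rows of `means` can be refit individually to propagate those correlations into a parameter uncertainty.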



3.3 Density profiles 

We model the lensing signal as a sum of contributions from 
the cluster-mass cross-correlation from the cluster (one-halo 
term) and from large-scale structure (halo-halo term). At 
small scales, contributions from the stars in the central 
galaxy are also important, but we show that their contri- 
bution is negligible for the range of scales we use for our 
fits (0.5-4 $h^{-1}$ Mpc). Figure 2 shows the relative contributions
of these three components for representative cases. The
halo-halo term is significant on scales $R > 1\,h^{-1}$ Mpc, but
sub-dominant to the one-halo term on all scales used for the 
fits. 

The cluster mass distribution is modeled as a
Navarro-Frenk-White (hereafter NFW) profile of cold dark matter
haloes (Navarro et al. 1997)

$\rho(r) = \frac{\rho_s}{(r/r_s)(1 + r/r_s)^2}$,   (11)

defined by two parameters, the concentration $c = r_{\rm vir}/r_s$
and the halo mass $M_{200\bar\rho}$. While many definitions are used
in the literature, here we define the virial radius $r_{\rm vir}$ as the
radius within which the average density is equal to 200 times
the mean density of the Universe $\bar\rho$, so that

$M_{200\bar\rho} = \frac{4\pi}{3}\,r_{\rm vir}^3\,(200\bar\rho)$,   (12)

where the subscript denotes that this mass definition uses
$200\bar\rho$ rather than the oft-used $200\rho_{\rm crit}$. The two mass
definitions differ by roughly 30 per cent for typical values of
concentration.

We take the concentration to be a fixed function of mass

$c(M_{200\bar\rho}) = 5.0 \left(\frac{M_{200\bar\rho}}{10^{14}\,M_\odot}\right)^{-0.10}$.   (13)

In other words, we assume that the mass distribution only 
depends on a single parameter, the cluster mass $M_{200\bar\rho}$. The
exponent in Eq. 13 matches the results of N-body simulations
(Neto et al. 2007) and the normalization is determined
from the observed density profiles of clusters in the
maxBCG catalog (Mandelbaum et al. 2008). We find that
increasing the normalization from 5.0 to 6.0 results in a decrease
in the best-fitting mass of < 3 per cent for most of
the mass range we consider. In particular, this means that
when we use a fixed mass-concentration relation, we tend
to slightly overestimate the masses of clusters with
high-luminosity BCGs relative to those that have low-luminosity
ones, since the former tend to have earlier formation times,
and therefore, higher concentrations. This effect would lead
to a small positive trend in mass with BCG luminosity at
fixed richness, but we find that the induced slope (0.025) is
negligible compared to the observed slopes, $\gamma$ in Table 2. To
estimate this slope, we have used a result from the simulations
of Croton et al. (2007; Fig. 4) that indicates that a
difference of ~ 1 magnitude in BCG luminosity corresponds
to a roughly 20 per cent difference in halo concentration.
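
The one-parameter halo model described above can be made concrete with a short Python sketch that maps a mass $M_{200\bar\rho}$ to the NFW parameters. Several choices here are assumptions made for the example and are not taken from the text: $\Omega_m = 0.3$, masses in $h^{-1}\,M_\odot$ with comoving radii in $h^{-1}$ Mpc, the pivot mass of the concentration relation expressed in those same mass units, a slope of $-0.10$, and all function names.

```python
import numpy as np

RHO_CRIT = 2.775e11   # critical density today in h^2 M_sun / Mpc^3


def concentration(m200, c0=5.0, pivot=1e14, slope=-0.10):
    """Fixed mass-concentration relation with normalization 5.0 at the pivot mass."""
    return c0 * (m200 / pivot) ** slope


def virial_radius(m200, omega_m=0.3):
    """Radius enclosing 200 times the mean matter density (comoving h^-1 Mpc)."""
    rho_mean = omega_m * RHO_CRIT
    return (3.0 * m200 / (4.0 * np.pi * 200.0 * rho_mean)) ** (1.0 / 3.0)


def nfw_parameters(m200, omega_m=0.3):
    """Return (r_vir, r_s, rho_s) such that the NFW mass inside r_vir equals m200."""
    c = concentration(m200)
    r_vir = virial_radius(m200, omega_m)
    r_s = r_vir / c
    rho_s = m200 / (4.0 * np.pi * r_s ** 3 * (np.log(1.0 + c) - c / (1.0 + c)))
    return r_vir, r_s, rho_s


def nfw_density(r, r_s, rho_s):
    """NFW density profile rho_s / [(r/r_s) (1 + r/r_s)^2]."""
    x = np.asarray(r, dtype=float) / r_s
    return rho_s / (x * (1.0 + x) ** 2)
```

Under these assumptions a halo of $10^{14}\,h^{-1}\,M_\odot$ has a virial radius of roughly 1.1 $h^{-1}$ Mpc, which falls inside the 0.5-4 $h^{-1}$ Mpc range used for the fits.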

The halo-halo contribution to the lensing signal is modeled
using the galaxy-matter cross-power spectrum as in,
e.g., Mandelbaum et al. (2005b). It is proportional to the
bias b, the ratio of the galaxy-matter correlation function to
the matter autocorrelation function. We express the bias as
a function of mass or peak height $\nu$ (Sheth & Tormen 1999):

$b(\nu) = 1 + \frac{a\nu - 1}{\delta_c} + \frac{2p}{\delta_c\,[1 + (a\nu)^p]}$,   (14)

where the peak height $\nu = \delta_c^2/\sigma^2(M)$, $\delta_c = 1.686$ is the linear
overdensity at which a spherical perturbation collapses
at redshift z, and $\sigma(M)$ is the rms fluctuation in spheres
that contain an average mass M at an initial time, extrapolated
using linear theory to z; we use z = 0.23, the median
redshift of the sample. For the purposes of computing bias,
we use a = 0.73 and p = 0.15 in order to match the results of
Seljak & Warren (2004). For example, at z = 0.23, clusters
of mass $6 \times 10^{13}$ and $6 \times 10^{14}\,h^{-1}\,M_\odot$ have biases of 2.2 and
5.5, respectively.
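
For reference, the bias relation of Eq. 14 is simple to evaluate once the peak height is known. The sketch below assumes the standard Sheth-Tormen functional form with the parameter values quoted above; the function name is hypothetical, and the computation of $\sigma(M)$, which requires a linear power spectrum, is deliberately left out.

```python
def halo_bias(nu, a=0.73, p=0.15, delta_c=1.686):
    """Halo bias as a function of peak height nu = delta_c^2 / sigma^2(M)."""
    a_nu = a * nu
    return 1.0 + (a_nu - 1.0) / delta_c + 2.0 * p / (delta_c * (1.0 + a_nu ** p))
```

The bias increases monotonically with peak height, so more massive (rarer) clusters contribute a proportionally larger halo-halo term.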

For illustration purposes, we model the stellar component
by a Hernquist density profile (Hernquist 1990), which
is similar to the NFW profile in Eq. 11 but with an exponent
of 3 instead of 2, so that it falls off faster at large scales. We
estimate stellar masses from the mean k+e corrected r band
magnitudes of BCGs in each bin, assuming a mass-to-light
ratio of ~ $3\,M_\odot/L_\odot$ (Padmanabhan et al. 2004), following
Mandelbaum et al. (2006). We estimate the Hernquist profile
scale radius by the measured de Vaucouleurs half-light
radius multiplied by a factor of $(\sqrt{2} - 1) \approx 0.414$. Figure 2
shows that the stellar contribution to the lensing signal is
negligible in the range of scales used for our fits. Thus, we do
not include a stellar component in our model of the cluster
density profile.

3.4 Fits to the lensing signal 

We perform fits to the lensing signal at scales R = 0.5-4.0
$h^{-1}$ Mpc, which is around the virial radii of clusters in
our sample. This choice of fitting range allows us to obtain
robust mass estimates (discussed in Sec. 4.2). The stellar
contribution to the lensing signal is negligible at these scales
(see Fig. 2). We therefore model the lensing signal as a sum
of one-halo and halo-halo profiles.

For any $M_{200\bar\rho}$, we can calculate the one-halo and
halo-halo profiles using the relations given in Eqs. 6 through 14. Given the
observed lensing signal $\Delta\Sigma(R)$, we determine the best-fitting
lensing profile by minimizing $\chi^2$ using the smooth, analytic
(diagonal) covariance matrix. We determine formal $1\sigma$ errors
on the best-fitting parameter $M_{200\bar\rho}$ using the distribution
of parameters obtained from many bootstrap-resampled
datasets. This procedure incorporates correlations between
the radial bins.

Figure 2 shows representative examples of observed
lensing signals and best-fitting profiles. The halo-halo term
becomes important at scales $R > 1.0\,h^{-1}$ Mpc. Neglecting
to include this component would yield ~ 7 per cent larger 
mass estimates compared to fits that include it. 
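
The fitting procedure of this section, a one-parameter $\chi^2$ minimization with a diagonal covariance followed by refits of bootstrap-resampled signals, can be sketched as below. The code is a hedged illustration: the grid search, the function names, and in particular `toy_model` are assumptions made for the example. `toy_model` is a placeholder power law standing in for the sum of the one-halo and halo-halo profiles, which is not reproduced here.

```python
import numpy as np


def toy_model(radius, mass):
    """Placeholder signal model (NOT the NFW plus halo-halo profile of the text)."""
    return 100.0 * (mass / 1e14) ** (2.0 / 3.0) / np.asarray(radius, dtype=float)


def fit_mass(radius, signal, variance, model, mass_grid):
    """Best-fitting mass on a grid, minimizing chi^2 with a diagonal covariance."""
    chi2 = np.array([np.sum((signal - model(radius, m)) ** 2 / variance) for m in mass_grid])
    best = int(np.argmin(chi2))
    return mass_grid[best], chi2[best]


def bootstrap_mass_error(radius, boot_signals, variance, model, mass_grid):
    """Standard deviation of the best-fitting masses over bootstrap-resampled signals."""
    masses = np.array([fit_mass(radius, s, variance, model, mass_grid)[0] for s in boot_signals])
    return np.std(masses, ddof=1)
```

Although each individual fit ignores off-diagonal covariance terms, the spread of best-fitting masses across bootstrap realizations reflects the correlated fluctuations of the radial bins, which is how the procedure incorporates them.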

3.5 Interpretation of the best-fitting mass 

The stacked weak lensing signal that we measure is the mean 
signal around a set of clusters with a range of redshifts and
masses. Previous studies (Mandelbaum et al. 2005b) and the
quality of our fits indicate that the mean signal can be modeled
as a single NFW profile to a high degree of accuracy.
Moreover, Mandelbaum et al. (2005b) showed that if the
mass distribution is narrow (with a typical width of less 
than a factor of ~ 5 in mass), this model is able to deter- 
mine the mean mass of the set of clusters accurately. If there 
is significant scatter in the mass distribution, then the clus- 
ter mass estimate falls between the distribution mean and 
median. 

Here, we consider two kinds of stacking processes: (a) 
over a set of clusters that lie within a narrow range of observ- 
able properties (e.g., richness or luminosity), and (b) over a 
set of clusters that satisfy a threshold in a given property. 
For case (a), we interpret the best-fitting mass $M_{200\bar\rho}$ as an
estimate of the mean mass of the clusters. We use this approach
to calibrate the mean relation between cluster mass
and a given cluster observable property.

For case (b), while $M_{200\bar\rho}$ may not be a faithful estimate
of the true mean mass because of the broad mass distribution,
it nevertheless allows us to assess the relative amount of
scatter in a given mass-observable relation $M = M(O)$.
Assuming a monotonic mass-observable relation without scatter,
rank ordering the clusters by an observable is the same
as rank ordering them by mass. Thus, selecting the top N
clusters by observable would select the N most massive clusters.
Moreover, if there are two tracers with no scatter they
would produce the same sample, even if the functional forms
$M(O)$ differ. The effect of scatter is to bring in clusters with
lower mass, which would lower the mean weak lensing signal
around the stacked clusters and the corresponding
best-fitting mass. Thus, a higher best-fitting mass obtained from
a given observable threshold at fixed number density indicates
a lower scatter in the corresponding mass-observable
relation. This analysis has been worked out explicitly for the
case of log-normal scatter in Mandelbaum & Seljak (2007).
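
The threshold argument can be illustrated with a small Monte Carlo experiment. The sketch below is a toy demonstration under stated assumptions and not the analysis of this paper: cluster masses are drawn from an arbitrary steep distribution, the observable is the log mass plus log-normal scatter, and the function name, scatter values and sample sizes are chosen only for the example.

```python
import numpy as np


def mean_mass_of_top_n(log_mass, scatter_dex, n_top, rng):
    """Mean true mass of the n_top clusters ranked by a noisy observable."""
    log_obs = log_mass + rng.normal(0.0, scatter_dex, size=log_mass.size)
    top = np.argsort(log_obs)[-n_top:]
    return np.mean(10.0 ** log_mass[top])


rng = np.random.default_rng(42)
# Toy mass distribution: exponential in log10(M), i.e. a steep power law in M.
log_mass = 13.5 + rng.exponential(scale=0.2, size=100_000)

exact = mean_mass_of_top_n(log_mass, 0.0, 1000, rng)   # tracer with no scatter
tight = mean_mass_of_top_n(log_mass, 0.1, 1000, rng)   # low-scatter tracer
loose = mean_mass_of_top_n(log_mass, 0.4, 1000, rng)   # high-scatter tracer
print(exact, tight, loose)
```

At fixed number of selected clusters, the mean true mass decreases as the scatter of the tracer increases, which is the ordering used in the text to rank tracers by the best-fitting stacked mass.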

Finally, we note that the mass that we measure from the 
weak lensing signal around stacked clusters may differ from 
other mass definitions, such as from spherical overdensity, 
because the presence of substructure and filaments introduces
scatter between the two quantities. This scatter may be large 
if only a small number of clusters is stacked and one should 
quantify this with simulations, which is beyond the scope of 
this paper. Here we simply take lensing-defined mass as the 
mass definition. 



4 TESTS OF SYSTEMATICS

In this section, we discuss various tests of systematics asso- 
ciated with the cluster lens catalog, including photometric 
redshift errors (Sec. 4.1) and offsets from the cluster centre
(Sec. 4.2), and with the weak lensing source galaxy catalog,
including lensing calibration (Sec. 4.3) and contamination
from intrinsic alignments (Sec. 4.4).

4.1 Cluster photometric redshift errors 

Koester et al. (2007a,b) assessed the accuracy of photometric
redshifts (photo-z's) in the maxBCG catalog by comparing
them with measured spectroscopic redshifts (available
for ~ 40 per cent of the sample). They found that the photo-z
dispersion $\sqrt{\langle (z_{\rm photo} - z_{\rm spec})^2 \rangle} \approx 0.01$, and is essentially
independent of redshift for the range covered by the sample 
0.1 < z < 0.3. In this section, we investigate the effect of 
photometric redshift errors on our results. 

Cluster photo-z errors affect both the measurement of
cluster properties and the computation of the lensing sig- 
nal. The reported luminosities in the maxBCG catalog were 
converted from apparent magnitudes using distances from 
photo-z's, so an overestimate in the redshift would result 
in a corresponding overestimate in the reported luminosi- 
ties. In addition, L200 and N200 would be affected because 
the change in both r2oo and L* would change which galax- 
ies would be considered cluster members by the maxBCG 
cluster finder. 

The lensing signal computation is affected in three ways:
first, the lensing signal calibration depends on the lens-source
geometry, and therefore on the assumed value for
the cluster redshift; second, the conversion from angular
distance to transverse separation depends on photometric
redshift; third, the change in the observed property
(luminosity or richness) would change the bin in which a given
cluster belongs. Generically, we expect the first two errors
to cancel out at some level for any given cluster: e.g., if the
lens photo-z is overestimated, then $\Sigma_c$ and hence
$\Delta\Sigma$ are underestimated, but due to the error in the
angular diameter distance we also overestimate the transverse
separation R, which increases the signal at fixed transverse
separation.

Figure 3. Examples of galaxies identified as BCGs in the maxBCG catalog with reported r-band luminosities (k-corrected to z = 0.25)
of $L_{\rm BCG} > 16 \times 10^{10}\,h^{-2}\,L_\odot$; these images are taken from the SDSS DR6 SkyServer. From left to right: (a) SDSS J085540.19-003257.2
(z = 0.271); (b) SDSS J212939.95+000521.1 (z = 0.234); (c) SDSS J085458.90+490832.3 (z = 0.052); and (d) SDSS J102246.44+483813.6
(z = 0.050). Objects (a) and (b) have accurate photo-z's. These fields show a dominant cD galaxy (the BCG) surrounded by other red
galaxies, typical of clusters in the catalog. Object (b)'s photo-z was successful despite the presence of [O II], Hα, and [N II] emission lines
that are unusual for a BCG. Objects (c) and (d) have severely overestimated photo-z's (0.127 and 0.138, respectively). For (c), the error
in the photometric redshift is probably due to the difficulty in deblending the overlapping galaxies. Object (d) seems to be a face-on
spiral galaxy with thick dust lanes, which was mistaken for a BCG. We estimate the contamination of the catalog from such objects to
be <2.4 per cent based on the incidence of very large errors in photometric redshift for those objects with spectra.

Out of 5 423 BCGs (43 per cent of the sample) with
measured spectroscopic redshifts, 131 galaxies (2.4 per cent)
have severe photo-z errors, corresponding to differences in
distance moduli larger than 0.5 magnitudes. The incidence
of photo-z errors is much higher for BCGs with the highest
reported luminosities, as expected since these extremely
luminous objects are rare and a few photo-z failures on less
luminous objects can lead to a large fractional contamination.
Of the 49 objects with reported
$L_{\rm BCG} > 16 \times 10^{10}\,h^{-2}\,L_\odot$,
12 per cent (6 objects) have severe photo-z errors. We show
some examples in Fig. 3.
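To make the 0.5 magnitude criterion concrete, the sketch below converts a photo-z error into a distance-modulus difference and a luminosity factor. The flat ΛCDM cosmology with Ω_m = 0.3 is an assumption made here for illustration (the Hubble constant cancels in the ratio), changes in the k-correction are ignored, and the function names are hypothetical.

```python
import math

def luminosity_distance(z, omega_m=0.3, steps=1000):
    """Luminosity distance in units of c/H0 for flat LCDM (assumed cosmology)."""
    def inv_e(zz):
        return 1.0 / math.sqrt(omega_m * (1.0 + zz) ** 3 + (1.0 - omega_m))
    # Trapezoidal integration of the comoving distance integral.
    dz = z / steps
    total = 0.5 * (inv_e(0.0) + inv_e(z))
    for i in range(1, steps):
        total += inv_e(i * dz)
    return (1.0 + z) * total * dz

def distance_modulus_error(z_photo, z_spec, omega_m=0.3):
    """Distance modulus at z_photo minus that at z_spec, in magnitudes."""
    ratio = luminosity_distance(z_photo, omega_m) / luminosity_distance(z_spec, omega_m)
    return 5.0 * math.log10(ratio)

def luminosity_factor(delta_mu):
    """Factor by which the inferred luminosity is multiplied."""
    return 10.0 ** (0.4 * delta_mu)

# Object (c) of Fig. 3: spectroscopic z = 0.052, photometric z = 0.127.
dmu = distance_modulus_error(0.127, 0.052)
print(f"delta mu = {dmu:.2f} mag, luminosity factor = {luminosity_factor(dmu):.1f}")
```

Under these assumptions the 0.5 mag threshold corresponds to a luminosity error of a factor of about 1.6, while object (c) is off by roughly 2 mag, which is why such failures populate the highest reported luminosities.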

To test for the effect of lens photo-z errors on our 
weak lensing analysis, we divide the 5 423 clusters (with 
measured spectroscopic redshifts) into two redshift bins, 
0.10 < z < 0.23 and 0.23 < z < 0.30, and five bins in 
BCG luminosity. We calculate their lensing signal in two 
ways: (i) using photometric redshifts and the reported BCG 
luminosities, and (ii) using spectroscopic redshifts and BCG 
luminosities scaled to the measured spectroscopic redshifts. 
Figure 4 compares the measured lensing signals for the two
cases. Note that the binning assignment is different in the 
two cases because of the difference in assumed BCG lumi- 
nosities. The lensing signals for the highest Lbcg bins tend 
to be noisier for case (ii) because these bins include very few 
objects once we correct for photo-z errors. Within the error 
bars, we find no systematic difference between the two cases. 
Therefore, for our main analysis, we use the full cluster sam- 
ple and the reported photometric redshifts and luminosities. 



4.2 Offsets from cluster centre 

BCGs are generally expected to lie at or near the centres of
clusters, where the potential well is the deepest, but this is
not always observed. Using N-body mock galaxy catalogs,
Johnston et al. (2007) found that only ~60-80 per cent of
the BCGs identified by the maxBCG cluster finder are located
near the halo centre, and that the offsets of the rest
of the BCGs can be modeled as a projected Gaussian distribution
with a width of 0.42 Mpc. These results must,
however, be seen in light of the fact that the halos in the
simulations do not correspond exactly to clusters in the data.



For our weak lensing measurements, we define the location
of the BCG to be the centre of the cluster, but take
steps to reduce the effect of offsets from the cluster centre
on the mass estimates. Fits for the concentration from
the lensing profiles of clusters in the maxBCG catalog show
that the effect of miscentering is important (leading to
shallower derived concentrations and lower masses) when fits
use transverse separations $R < 0.5\,h^{-1}$ Mpc, but not when
the fits are restricted to $R > 0.5\,h^{-1}$ Mpc (Mandelbaum
et al. 2008). Fitting from 0.2 instead of $0.5\,h^{-1}$ Mpc tended
to suppress the concentrations at the ~20 per cent level.
Therefore, we restrict the fitting range to $R > 0.5\,h^{-1}$ Mpc
in this work.



4.3 Lensing calibration 

Lensing calibration systematics due to the source sample 
include source redshift uncertainties, shear calibration, and 
stellar contamination. Since these effects do not vary with 
scale, they could only change the overall normalization in 
the derived mass-observable relation. 

Comparison with spectroscopy from DEEP2 and zCOSMOS
showed that to account for photometric redshift errors
in the source redshifts, one has to multiply the signal by a
calibration factor of 0.97 ± 0.02 for the 0.10 < z < 0.23
sample, and 0.98 ± 0.04 for the 0.23 < z < 0.30 sample
(Mandelbaum et al. 2008). Stellar contamination in the
source catalog, which would decrease the lensing signal, is
tightly constrained to less than 1 per cent using COSMOS
data (Mandelbaum et al. 2008). Taking this into account,
the calibration factors become 0.98 ± 0.02 for the 0.10 < z <
0.23 sample and 0.99 ± 0.04 for the 0.23 < z < 0.30 sample.
Since these are within 1σ of unity and are much smaller
than the statistical error bars on the weak lensing signal, we
choose not to apply these correction factors in this work. A
conservative estimate of the total calibration uncertainty,
including both these two effects and the shear calibration bias,
is 8 per cent at the 1σ level (Mandelbaum et al. 2005a). This
can be taken into account by adding it in quadrature to the
statistical error on the mass determinations.
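As a worked illustration of the quadrature sum, the sketch below treats the 8 per cent calibration uncertainty as a fractional error on the mass. That one-to-one mapping from signal calibration to mass is an assumption of the example, and `total_mass_error` is a hypothetical helper name.

```python
import math

def total_mass_error(mass, stat_err, calib_frac=0.08):
    """Combine statistical and fractional calibration errors in quadrature."""
    calib_err = calib_frac * mass
    return math.sqrt(stat_err ** 2 + calib_err ** 2)

# Example: a mass normalization of 1.27 +/- 0.08 (statistical),
# in units of 10^14 h^-1 Msun.
print(f"total 1-sigma error = {total_mass_error(1.27, 0.08):.3f}")
```

For a normalization with a statistical error of this size, the calibration term is comparable to the statistical one, so the combined uncertainty is noticeably larger than either alone.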
Figure 4. Test of systematics for the effect of cluster photo-z
errors. Clusters are divided into two ranges in redshift (upper
and lower panels) and five bins in BCG luminosity (the four
highest luminosity bins are shown above, with the mean Lbcg
listed in units of $10^{10}\,h^{-2}\,L_\odot$). The stacked weak lensing signal
around clusters in each bin is calculated in two ways: (i) using
photometric redshifts and the reported BCG luminosities (filled
circles/black), and (ii) using spectroscopic redshifts and BCG
luminosities scaled to the spectroscopic redshifts (crosses/red). The
best-fitting one-halo + halo-halo profiles are shown in each case
(solid and dashed curves, respectively). The data points have been
slightly offset horizontally for clarity. The vertical dashed line
marks the range of scales used in our fits, $R = 0.5$-$4.0\,h^{-1}$ Mpc.



4.4 Intrinsic alignments 

The important intrinsic alignment effect for cluster-galaxy 
lensing is the alignment between the intrinsic ellipticity of 
a galaxy and the direction to nearby cluster BCGs. This 
effect comes into play because we necessarily include some 
physically-associated pairs (i.e., pairs of lenses and "sources" 
that are really part of the same local structure); if these 
sources preferentially align tangentially or radially relative 
to the lens, they would provide an additive bias to the lens- 
ing signal. 

The effect of intrinsic alignments on the lensing profile
is more important at small transverse separations, since
close physically associated pairs tend to be more aligned.
Using the same source catalog used here, Mandelbaum et al.
(2006) found that intrinsic alignment contamination of the
lensing signal for luminous red galaxies (LRGs) is only
important at scales $R < 0.1\,h^{-1}$ Mpc, given our procedures
for removing physically associated galaxies from the
source sample. Since many cluster BCGs are also in this
LRG sample, this result is relevant for the current work.
Agustsson & Brainerd (2006) measured the mean tangential
shear of spectroscopically determined satellites and found
a tendency for satellites to align radially towards central
galaxies over the range $7 < R < 50\,h^{-1}$ kpc. Since we
have used photometric redshift estimates to separate source
galaxies from lenses, and we only use the lensing signal data
in the range $R = 0.5$-$4.0\,h^{-1}$ Mpc, our results should not
be affected by contamination from intrinsic alignments.

Nonetheless, we present constraints on intrinsic alignment
contamination of the lensing signal for the transverse
separations used here. To do so, we use the formalism and
results on intrinsic alignments in LRG lenses with our source
catalog from Mandelbaum et al. (2006). For the "bright"
lens sample in that work (corresponding to halo masses of
$\sim 7 \times 10^{13}\,h^{-1}\,M_\odot$), the intrinsic alignment signal was not
detected at $0.5$-$0.6\,h^{-1}$ Mpc, and was constrained to
contaminate the lensing signal by $< 3\,h\,M_\odot\,{\rm pc}^{-2}$ at 95 per cent
CL. This constraint is in fact conservative, since there is
reason to believe that the sample of red "source" galaxies
that we used to place the constraint is more strongly
intrinsically aligned than the general galaxy population. Given
that the typical lensing signal for the maxBCG clusters on
these scales is more than 20 times larger than this conservative
bound, we conclude that it is not an important contaminant
for this work. For larger scales, it is also not important,
since the effect is expected to decrease with transverse
separation, as does the fraction of physically associated "source"
galaxies.



5 RESULTS 

In this section, we calibrate and assess the scatter in the
relation between several cluster properties and cluster mass,
as outlined in Sec. 3.5. In Sec. 5.1, we consider three main
observable parameters: cluster richness in red galaxies N200,
cluster luminosity in red galaxies L200, and luminosity of
the brightest cluster galaxy Lbcg. In Sec. 5.2, we consider
power-law combinations of N200 and L200 with Lbcg, with
the aim of finding improved mass tracers for galaxy clusters.

Number   Range            ⟨N200⟩   ⟨L200⟩             ⟨Lbcg⟩             M200P
                                   10^10 h^-2 L_sun   10^10 h^-2 L_sun   10^14 h^-1 M_sun

Bins in N200
4091     10-11            10.43    16.29              4.67               0.65 ± 0.30
5164     12-17            13.88    21.67              5.27               0.96 ± 0.32
2055     18-25            20.78    32.38              6.21               1.43 ± 0.42
933      26-40            31.06    48.40              7.05               2.48 ± 0.57
320      41-70            50.06    76.64              8.24               3.96 ± 0.77
49       71-190           89.86    140.87             10.45              8.72 ± 1.40

Bins in L200
4091     6.63-17.56       11.29    14.17              3.51               0.56 ± 0.29
5164     17.56-28.51      13.89    22.22              5.57               1.12 ± 0.33
2055     28.51-41.76      20.01    33.73              7.07               1.46 ± 0.43
933      41.76-64.46      29.90    50.37              8.10               2.47 ± 0.59
320      64.46-115.55     52.95    88.44              9.83               4.13 ± 0.82
49       115.55-274.71    85.14    146.91             12.44              10.57 ± 1.44

Bins in Lbcg
4091     0.66-3.95        13.47    16.84              2.95               0.71 ± 0.31
5164     3.95-6.56        16.13    24.88              5.15               1.00 ± 0.32
2055     6.56-8.90        18.56    32.35              7.57               1.69 ± 0.42
933      8.90-11.73       21.56    40.73              10.02              2.40 ± 0.55
320      11.74-16.68      25.31    50.45              13.40              3.28 ± 0.83
49       16.68-29.05      34.78    74.61              19.74              6.77 ± 1.57

Table 1. Individual bins of clusters rank ordered according to: N200 (cluster richness in red galaxies), L200 (cluster luminosity in red
galaxies), and Lbcg (luminosity of the brightest cluster galaxy). The number of clusters in each bin, their range of properties, mean
N200, L200, Lbcg, and the estimated mean cluster mass M200P are listed. The 1σ errors on the mass estimates are derived from 2500
bootstrap-resampled datasets.



5.1 N200, L200 and Lbcg as Mass Tracers 

5.1.1 Calibration of mean mass-observable relations 

We begin by calibrating the mean relation between cluster
mass and three cluster properties, N200 (cluster richness in
red galaxies), L200 (cluster luminosity in red galaxies), and
Lbcg (luminosity of the brightest cluster galaxy). We rank
order the clusters in each property and divide them into six
individual bins, keeping the same number of clusters in
corresponding bins across the three properties (Table 1). We
measure the stacked weak lensing signal around clusters in
each bin, and determine the best-fitting mass M200P using
the procedure described in Sec. 3.4. We do this analysis for
the full redshift range 0.1 < z < 0.3. The results are shown
in Table 1 and Fig. 5.

The scaling of mean cluster mass with N200, L200, and
Lbcg is well described by power laws. To determine the
normalization and slope in these relations, we minimize $\chi^2$
simultaneously for the six sets of measured lensing signals.
We determine uncertainties on the parameters by repeating
the fitting procedure for the 2500 bootstrap-resampled
datasets. The best-fitting relations are:

$$M_{14}(N_{200}) = (1.42 \pm 0.08)\,(N_{200}/20)^{1.16 \pm 0.09} \qquad (15a)$$
$$M_{14}(L_{200}) = (1.76 \pm 0.17)\,(L_{200,10}/40)^{1.40 \pm 0.19} \qquad (15b)$$
$$M_{14}(L_{\rm BCG}) = (1.07 \pm \ldots)\,(L_{{\rm BCG},10}/5)^{1.10 \pm 0.13} \qquad (15c)$$

where $M_{14}$ is M200P in units of $10^{14}\,h^{-1}\,M_\odot$, and $L_{200,10}$ and
$L_{{\rm BCG},10}$ are in units of $10^{10}\,h^{-2}\,L_\odot$. From the covariance
matrix of the best-fitting parameters, we find that the slope
and normalization are uncorrelated for the N200 relation,
and anti-correlated at the ~50-60 per cent level for the
L200 and Lbcg relations.
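As a rough cross-check of the mass-richness relation, one can fit a power law directly to the binned masses of Table 1. This is a simplification and not the procedure of the paper, which fits the stacked lensing signals of all bins simultaneously; the weighted log-space least squares below, and the name `fit_power_law`, are assumptions of this sketch.

```python
import math

# Table 1, bins in N200: (mean N200, M200P in 10^14 h^-1 Msun, 1-sigma error).
N200_BINS = [
    (10.43, 0.65, 0.30), (13.88, 0.96, 0.32), (20.78, 1.43, 0.42),
    (31.06, 2.48, 0.57), (50.06, 3.96, 0.77), (89.86, 8.72, 1.40),
]

def fit_power_law(bins, pivot):
    """Weighted least-squares fit of ln M = ln A + alpha * ln(x / pivot)."""
    xs = [math.log(x / pivot) for x, _, _ in bins]
    ys = [math.log(m) for _, m, _ in bins]
    # Weights are 1 / sigma_lnM^2, with sigma_lnM approximated by err / M.
    ws = [(m / err) ** 2 for _, m, err in bins]
    wsum = sum(ws)
    xbar = sum(w * x for w, x in zip(ws, xs)) / wsum
    ybar = sum(w * y for w, y in zip(ws, ys)) / wsum
    sxx = sum(w * (x - xbar) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - xbar) * (y - ybar) for w, x, y in zip(ws, xs, ys))
    alpha = sxy / sxx
    norm = math.exp(ybar - alpha * xbar)
    return norm, alpha

norm, alpha = fit_power_law(N200_BINS, pivot=20.0)
print(f"A = {norm:.2f}, alpha = {alpha:.2f}")
```

Even this crude fit returns a normalization near 1.4 and a slope near 1.2 at the pivot N200 = 20, close to the values quoted for the mass-richness relation.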

The mass-L200 relation, Eq. 15b, is derived using only
clusters with $L_{200,10} > 28$, where the sample is complete
in L200 (Fig. 1). Thus, it is not affected by the selection
effect introduced by the N200 ≥ 10 cut. If we include
the full sample in the analysis, we find
$M_{14}(L_{200}) = (1.96 \pm 0.11)\,(L_{200,10}/40)^{1.13 \pm 0.08}$.
The derived slope is shallower than that in Eq. 15b,
consistent with the effect of missing lower mass clusters at
low luminosity. The sample is incomplete at all values of
Lbcg, so the shallow slope of Eq. 15c is partly due to this
selection effect; it should therefore be kept in mind that this
relation is valid only for the richness-selected sample. For a
sample complete in Lbcg, the slope would likely be steeper
and the mean mass at low Lbcg would be lower (e.g., Lin &
Mohr (2004) find that BCG luminosity scales with halo mass
with an exponent of 0.33 ± 0.06, which implies a much
steeper relation than that in Eq. 15c).

The slopes we find for the scaling of mass with
N200 and L200 are roughly consistent with the results of
Johnston et al. (2007) (Tables 10 and 11 list 1.30 and 1.25,
respectively, for the mass definition closest to ours, $M_{180b}$),
though it should be noted that they use additional clusters
(with N200 < 10). Our normalization for the mass-richness
relation is higher by ~18 per cent, but this can be explained
by our use of different methods for determining photometric
redshifts of source galaxies, which leads to different amounts
of bias in the estimated lensing signals. Mandelbaum et al.
(2008) tested for calibration bias in the lensing signal due to
use of different methods of determining source redshifts,
using source galaxies with spectroscopy from zCOSMOS and
DEEP2 as a reference. They concluded that for the maxBCG
lens redshift distribution and the methods used here of
determining source redshifts, the calibration bias in the lensing
signal is small (consistent with zero within our quoted
systematic error), whereas for the SDSS DR6 neural net
photo-z's used by Johnston et al. (2007), it is approximately -18
per cent.

Figure 5. Scaling of cluster mass M200P with various cluster mass
tracers. Masses are determined from the stacked weak lensing
signal around clusters in individual bins in N200 (filled circles), L200
(open circles), and Lbcg (crosses). The upper panels show the
scaling of cluster mass with mean parameters ⟨N200⟩ and ⟨L200⟩;
the dashed line on the upper right panel shows the L200 value
above which the sample is complete. The lower panels show the
scaling of cluster mass with a combination of the mean parameters,
with exponents taken from Table 2 in Sec. 5.2.1. The tighter
scaling of cluster mass with the combined tracers, regardless of
whether N200, L200, or Lbcg is used for the binning, suggests
that these combined quantities trace mass more faithfully than
either N200 or L200 taken alone.

5.1.2 Scatter in the mass-observable relations 

In this section, we assess the relative amount of scatter in
the various mass-observable relations derived above. As
discussed in Sec. 3.5, an observable threshold that yields a
higher best-fitting mass has a mass relation with lower scatter.
We define thresholds corresponding to cluster comoving
number densities of $n = \{20, 10, 5, 2.5\} \times 10^{-7}\,(h^{-1}\,{\rm Mpc})^{-3}$.
This translates to taking the top {384, 192, 96, 48} clusters
for the 0.10 < z < 0.23 sample, and the top {456, 233, 116,
58} clusters for the 0.23 < z < 0.30 sample. We measure the
stacked weak lensing signal for each threshold in N200, L200,
and Lbcg and compare the derived best-fitting masses in
Fig. 6.
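The correspondence between number density and cluster counts is simply N = nV. The short sketch below inverts this to recover the comoving volume implied by the quoted counts; no survey volume is stated in this section, so the volumes printed here are derived quantities, and `implied_volumes` is a hypothetical helper.

```python
# Number densities in (h^-1 Mpc)^-3 and the cluster counts quoted in the text.
DENSITIES = [20e-7, 10e-7, 5e-7, 2.5e-7]
COUNTS_LOW_Z = [384, 192, 96, 48]      # 0.10 < z < 0.23
COUNTS_HIGH_Z = [456, 233, 116, 58]    # 0.23 < z < 0.30

def implied_volumes(counts, densities):
    """Comoving volume V = N / n implied by each (count, density) pair."""
    return [count / density for count, density in zip(counts, densities)]

for label, counts in (("low z", COUNTS_LOW_Z), ("high z", COUNTS_HIGH_Z)):
    volumes = implied_volumes(counts, DENSITIES)
    print(label, [f"{v:.3g}" for v in volumes], "(h^-1 Mpc)^3")
```

The counts in each redshift range correspond to a nearly constant volume, as expected when the same survey footprint is thresholded at different densities.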

Out of the three parameters considered, we find that
Lbcg is the poorest tracer of cluster mass. This statement
is robust to the selection effect introduced by the N200 ≥ 10
cut, since the inclusion of poorer, low-mass clusters into the
threshold would further decrease the lensing signal. We note
that the scatter in the mass relation is a combination of
intrinsic and observational scatter, and the contribution from
the latter may be significant because of the difficulty in
measuring accurate BCG luminosities. For example, systematic
errors from sky subtraction are important for BCGs because
they have large, diffuse envelopes, and deblending issues are
also important because BCGs are located in dense
environments.

Figure 6 shows that the best-fitting masses M200P for
clusters in N200 and L200 thresholds at the same number
density tend to be comparable. However, about 70-80 per
cent of the clusters selected by the N200 threshold are also
selected by the corresponding L200 threshold. Thus, the error
bars on these data points are tightly correlated, and the
differences in the masses are more significant than what
one would estimate by eye. We therefore assess the statistical
significance of these differences using results from many
bootstrap datasets. We find that the N200 threshold yields
a higher mass than the L200 threshold in {72, 45, 93, 68}
per cent of the cases (for the 0.10 < z < 0.23 sample),
and in {37, 68, 27, 90} per cent of the cases (for the
0.23 < z < 0.30 sample), in order of decreasing number
density. These high values indicate that N200 picks out more of
the most massive clusters most of the time, and therefore
has smaller scatter than L200 at this range of masses.
Figure 6 also shows for comparison the masses obtained from
thresholds in the combined mass tracers, which we discuss
in Sec. 5.2.2.
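The bootstrap comparison described here amounts to counting, over paired resamples, how often one tracer gives the higher mass. The sketch below uses synthetic numbers only: the shared and independent error sizes and the helper names `percent_higher` and `toy_bootstrap` are assumptions, chosen to show why a small offset can still be detected when the two error bars are strongly correlated.

```python
import random

def percent_higher(masses_a, masses_b):
    """Percentage of paired bootstrap realizations in which A exceeds B."""
    if len(masses_a) != len(masses_b):
        raise ValueError("bootstrap samples must be paired")
    wins = sum(1 for a, b in zip(masses_a, masses_b) if a > b)
    return 100.0 * wins / len(masses_a)

def toy_bootstrap(n_boot=2500, shared_sigma=1.0, own_sigma=0.1, offset=0.1, seed=2):
    """Synthetic paired masses: a large shared fluctuation plus small independent ones."""
    rng = random.Random(seed)
    masses_a, masses_b = [], []
    for _ in range(n_boot):
        shared = rng.gauss(0.0, shared_sigma)   # common to both thresholds
        masses_a.append(5.0 + offset + shared + rng.gauss(0.0, own_sigma))
        masses_b.append(5.0 + shared + rng.gauss(0.0, own_sigma))
    return masses_a, masses_b

a, b = toy_bootstrap()
print(f"A > B in {percent_higher(a, b):.0f} per cent of resamples")
```

In this toy setup each mass has an error bar of about 1.0, ten times the offset between the tracers, yet A exceeds B in a clear majority of resamples because the shared fluctuation cancels in the paired comparison.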

5.2 Combined mass tracers 

In this section, we consider whether adding information from
BCG luminosity can provide improved estimates of cluster
masses. Our previous analysis shows that Lbcg by itself
does not trace mass as well as N200 or L200. However, the
scatter in Lbcg at a fixed N200 or L200 suggests that there
may be residual scaling of mass with Lbcg. Figure 5 shows
that the scaling of mass with a combination of N200 (or
L200) and Lbcg (lower panels) is tighter than that with
N200 or L200 taken alone (upper panels), regardless of the
parameter used for binning the clusters. This suggests that
the additional information in Lbcg reduces the scatter in
the mass relation. Here, we consider power law combinations
of Lbcg with N200 (or L200) as mass tracers. We calibrate
the mass relation in Sec. 5.2.1 and assess the scatter in this
relation in Sec. 5.2.2.

5.2.1 Calibration of mean mass-observable relations 

To consider the scaling of mass with both N200 and Lbcg
simultaneously, we divide the cluster sample into five bins in
N200 and further split these bins in Lbcg, for a total of 22
bins in the two-dimensional N200-Lbcg space. We make a
similar division in L200-Lbcg space for clusters with
$L_{200,10} > 28$ (for which the sample is complete), resulting
in nine bins. We then measure the stacked weak lensing signal
around clusters in each bin. We do this analysis for two
redshift ranges, 0.10 < z < 0.23 and 0.23 < z < 0.30.

Figure 6. Comparison of the relative amount of scatter in the various mass tracers. Higher values of the best-fitting cluster mass,
M200P, indicate a lower scatter in the mass relation. Upper panels: Cluster masses M200P from stacked weak lensing signals around
clusters satisfying thresholds in the various tracers, for comoving number densities $n = \{20, 10, 5, 2.5\} \times 10^{-7}\,(h^{-1}\,{\rm Mpc})^{-3}$. We
compare the mass tracers N200, L200, Lbcg, $N_{200} L_{\rm BCG}^{\beta_N^{\rm (best)}}$ and $L_{200} L_{\rm BCG}^{\beta_L^{\rm (best)}}$, the combined tracers that yield the highest masses at
each number density (Sec. 5.2.2). Left and right plots are for the two redshift ranges; the data points in the figure are slightly offset
horizontally for clarity. The 1σ error bars shown here are tightly correlated, so the differences in the masses are more significant than
apparent by eye. Lower panels: Probability that the $\beta^{\rm (best)}$ tracer yields a higher mass than N200 (filled circles/black), L200 (open
circles/blue), or Lbcg (crosses/red) taken alone, defined to be the percentage of cases among 1000 bootstrap-resampled datasets. High
values of this quantity suggest that the combined tracers have comparable or lower scatter than either N200 or L200 taken alone, for this
range of cluster abundances.

Figure 7. Scaling of mean cluster mass M200P with Lbcg within narrow bins in N200. The best-fitting mass relation $M(N_{200}, L_{\rm BCG})$
(given by Eq. 16a) is shown as solid lines. The mass relation without the Lbcg dependence (i.e., with $\gamma_N = 0$) is shown as
dashed lines. We find residual scaling with $\gamma_N = 0.71 \pm 0.14$ in the lower redshift sample (left), and with $\gamma_N = 0.34 \pm 0.24$ in the
higher redshift sample (right).

We parametrize the scaling of mass as a power law in
N200 (or L200) with an additional scaling with Lbcg at fixed
N200 (or L200):

$$M_{14}(N_{200}, L_{\rm BCG}) = M_N^0\,(N_{200}/20)^{\alpha_N}\,\left(L_{\rm BCG}/\bar{L}_{\rm BCG}^{(N)}\right)^{\gamma_N} \qquad (16a)$$
$$M_{14}(L_{200}, L_{\rm BCG}) = M_L^0\,(L_{200,10}/40)^{\alpha_L}\,\left(L_{\rm BCG}/\bar{L}_{\rm BCG}^{(L)}\right)^{\gamma_L} \qquad (16b)$$

where $M_{14}$ is M200P in units of $10^{14}\,h^{-1}\,M_\odot$, $L_{200,10}$ is the
cluster luminosity in units of $10^{10}\,h^{-2}\,L_\odot$, and the BCG
luminosity dependence is pivoted at the mean Lbcg at the
given N200 (or L200). Parametrizing this mean relation as a
power law, the best-fitting relations are:

$$\bar{L}_{\rm BCG}^{(N)} \equiv \bar{L}_{\rm BCG}(N_{200}) = a_N\,N_{200}^{\,b_N} \qquad (17a)$$
$$\bar{L}_{\rm BCG}^{(L)} \equiv \bar{L}_{\rm BCG}(L_{200}) = a_L\,L_{200,10}^{\,b_L} \qquad (17b)$$

where $a_N = (1.54, 1.64) \times 10^{10}\,h^{-2}\,L_\odot$, $b_N = (0.41, 0.43)$ and
$a_L = (0.61, 0.58) \times 10^{10}\,h^{-2}\,L_\odot$, $b_L = (0.67, 0.66)$ for the
two redshift ranges (0.10 < z < 0.23, 0.23 < z < 0.30).
Combining Eqs. 16 and 17 gives a cluster mass estimate for
any cluster with measured N200 (or L200) and Lbcg.
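The combination of the mass relation (Eq. 16a) with the mean BCG luminosity relation (Eq. 17a) can be written as a short function. The sketch below uses the central values quoted in the text and in Table 2 for the N200-Lbcg relation; the dictionary layout and the names `mean_lbcg` and `mass_from_n200_lbcg` are illustrative choices, and parameter uncertainties are not propagated.

```python
# Central best-fitting values for the N200-Lbcg relation in each redshift range.
# M0, alpha, gamma enter Eq. 16a; a, b enter Eq. 17a.
PARAMS = {
    "low_z":  {"M0": 1.27, "alpha": 1.20, "gamma": 0.71, "a": 1.54, "b": 0.41},
    "high_z": {"M0": 1.57, "alpha": 1.12, "gamma": 0.34, "a": 1.64, "b": 0.43},
}

def mean_lbcg(n200, zbin="low_z"):
    """Eq. 17a: mean BCG luminosity at fixed richness, in 10^10 h^-2 Lsun."""
    p = PARAMS[zbin]
    return p["a"] * n200 ** p["b"]

def mass_from_n200_lbcg(n200, lbcg, zbin="low_z"):
    """Eq. 16a: M200P in 10^14 h^-1 Msun; lbcg is in 10^10 h^-2 Lsun."""
    p = PARAMS[zbin]
    richness_term = (n200 / 20.0) ** p["alpha"]
    bcg_term = (lbcg / mean_lbcg(n200, zbin)) ** p["gamma"]
    return p["M0"] * richness_term * bcg_term

# A cluster with N200 = 30 and a BCG twice as luminous as the mean at that richness.
n200 = 30
print(f"{mass_from_n200_lbcg(n200, 2.0 * mean_lbcg(n200)):.2f} x 10^14 h^-1 Msun")
```

At the pivot (N200 = 20 with Lbcg equal to the mean at that richness) the function returns the normalization itself, which is a convenient sanity check when adapting it to the L200-based relation of Eqs. 16b and 17b.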

We derive best-fitting parameters $M^0$, $\alpha$ and $\gamma$ (shown
in Table 2) by minimizing $\chi^2$ simultaneously for the set
of measured lensing signals. To obtain confidence intervals
on these fits, we repeat the fitting procedure for the
1000 bootstrap-resampled datasets, using the analytical
covariance matrix (rather than the full bootstrap covariance
matrix, which is too noisy to use to weight the fits).
The bootstrap-resampled datasets yield Gaussian probability
distributions in $M^0$, $\alpha$ and $\gamma$; the 1σ errors and
correlation coefficients for these parameters are also shown in
Table 2.

Comparison of the best-fitting mass relations for the
two redshift ranges suggests an increase in cluster mass with
redshift at fixed richness. Using the 1000 bootstrap-resampled
datasets, we find that the mass normalization for the
higher redshift sample is larger than that for the lower
redshift sample at ~97 per cent CL. We note, however, that the
redshift dependence may result from systematic effects due
to photo-z errors, which have a larger dispersion at lower
redshifts, and/or from evolution in the richness estimator
N200 (e.g., due to an incorrect assumption of the evolution
of the luminosity cut 0.4L*). Disentangling these effects
from "true" evolution requires a more careful control
of the systematics. Hints of an increase in cluster mass with
redshift at fixed N200 have been found in measurements of
X-ray luminosities (Rykoff et al. 2007) and velocity dispersions
(Becker et al. 2007) of clusters in the maxBCG catalog,
but no evidence of evolution had been detected in a previous
analysis of their weak lensing signal (Sheldon et al. 2007a).
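The quoted confidence level can be roughly reproduced from the N200-Lbcg normalizations in Table 2 (1.27 ± 0.08 and 1.57 ± 0.14). The sketch below assumes independent Gaussian errors for the two redshift samples, which is a simplification of the bootstrap counting used in the text; `prob_greater` is a hypothetical helper.

```python
import math

def prob_greater(x_hi, err_hi, x_lo, err_lo):
    """One-sided Gaussian probability that x_hi exceeds x_lo (independent errors)."""
    z = (x_hi - x_lo) / math.hypot(err_hi, err_lo)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

p = prob_greater(1.57, 0.14, 1.27, 0.08)
print(f"P(high-z normalization > low-z normalization) = {100 * p:.1f} per cent")
```

The difference of 0.30 is about 1.9 times its combined uncertainty, so the Gaussian estimate lands close to the ~97 per cent obtained from the bootstrap.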

Figures 7 and 8 show the scaling of cluster mass M200P
with Lbcg within narrow bins in N200 and L200. These scalings
are traced well by the best-fitting relations Eqs. 16a and
16b. At fixed N200, residual scaling with Lbcg is seen with
$\gamma_N = 0.71 \pm 0.14$ (~5σ) for the lower redshift sample, and
with $\gamma_N = 0.34 \pm 0.24$ for the higher redshift sample. At
fixed L200, we find $\gamma_L = 0.40 \pm 0.23$ (~2σ) for the lower
redshift sample, and $\gamma_L = 0.26 \pm 0.41$ for the higher redshift
sample. Constraints for the scaling with L200 are relatively
weaker because of the luminosity cut applied to the complete
sample, which reduces the number of clusters to about
one-third of the full sample. The scaling parameters are less
well constrained for the higher redshift range because there
are fewer lensed sources behind the high-redshift clusters.

Figure 8. Scaling of mean cluster mass M200P with Lbcg within
narrow bins in L200. The mean L200 in each bin is shown in
units of $10^{10}\,h^{-2}\,L_\odot$; we restrict this analysis to
$L_{200} > 28 \times 10^{10}\,h^{-2}\,L_\odot$, for which the sample is complete.
We find residual scaling of M200P with Lbcg at fixed L200 in the
lower redshift sample (upper panels), with $\gamma_L = 0.40 \pm 0.23$
(~2σ), and no significant evidence for residual scaling in the
higher redshift sample (lower panels), with $\gamma_L = 0.26 \pm 0.41$.



5.2.2 Scatter in the mass-observable relations 

We turn to the question of whether exploiting information
about BCG luminosity in addition to either N200
or L200 reduces the scatter in the mass relation. Similar
to Sec. 5.1.2, we rank clusters according to $L_{200} L_{\rm BCG}^{\beta_L}$
and $N_{200} L_{\rm BCG}^{\beta_N}$ and take the top N clusters to define
thresholds with comoving number densities
$n = \{20, 10, 5, 2.5\} \times 10^{-7}\,(h^{-1}\,{\rm Mpc})^{-3}$. We explore a set of
values of exponents, $\beta_N = \{0.25, 0.50, 0.75, 1.0, 1.5, 2.0\}$ and
$\beta_L = \{0.2, 0.4, 0.6, 0.8, 1.0\}$, to find the one that maximizes
M200P, or equivalently, minimizes the scatter in the
mass-observable relation. We do this analysis for two redshift
ranges, 0.10 < z < 0.23 and 0.23 < z < 0.30.
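The selection step of this procedure, ranking clusters by a combined tracer and keeping the top N, can be sketched as follows. The toy catalog is entirely hypothetical and is not drawn from maxBCG, and the step that actually decides between exponents in the text (fitting a mass to the stacked lensing signal of each selection) is not reproduced here.

```python
def combined_tracer(n200, lbcg, beta):
    """Value of the combined tracer N200 * Lbcg**beta."""
    return n200 * lbcg ** beta

def select_top(clusters, beta, top_n):
    """Return the top_n clusters ranked by the combined tracer, highest first."""
    ranked = sorted(
        clusters,
        key=lambda c: combined_tracer(c["n200"], c["lbcg"], beta),
        reverse=True,
    )
    return ranked[:top_n]

# Hypothetical toy catalog; lbcg is in units of 10^10 h^-2 Lsun.
TOY_CATALOG = [
    {"id": "A", "n200": 40, "lbcg": 6.0},
    {"id": "B", "n200": 35, "lbcg": 12.0},
    {"id": "C", "n200": 30, "lbcg": 20.0},
    {"id": "D", "n200": 45, "lbcg": 4.0},
]

for beta in (0.0, 0.5, 1.5):
    chosen = [c["id"] for c in select_top(TOY_CATALOG, beta, top_n=2)]
    print(f"beta = {beta}: selected {chosen}")
```

With beta = 0 the ranking reduces to richness alone; increasing beta promotes clusters with luminous BCGs into the thresholded sample, which is the degree of freedom scanned over in the text.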

The exponents that yield the highest masses at each
number density are (from highest to lowest number density):
$\beta_N^{\rm (best)} = \{1.5, 1.5, 0.25, 0.25\}$ and $\beta_L^{\rm (best)} = 0.4$ for the
lower redshift sample, and $\beta_N^{\rm (best)} = \{1.5, 1.0, 0.5, 0.5\}$ and
$\beta_L^{\rm (best)} = \{0.8, 0.6, 0.6, 0.2\}$ for the higher redshift sample. In
general, the tracer with the minimal scatter is a combination
of N200 and Lbcg (except for $n = 2.5 \times 10^{-7}\,(h^{-1}\,{\rm Mpc})^{-3}$ in
the higher redshift sample, where N200 alone yields the highest
mass). One possible reason for this trend is that at higher
redshifts, the large Lbcg bins are more likely to be contaminated
by low luminosity objects for which the photo-z has
been overestimated (Sec. 4.1).

The error bars are tightly correlated between the combined
and individual tracers, as well as between different
$\beta_N$ or $\beta_L$ values, because a significant fraction of the
clusters that satisfy the different thresholds are the same. For
example, for the lowest number density bin
$n = 2.5 \times 10^{-7}\,(h^{-1}\,{\rm Mpc})^{-3}$, there is substantial overlap between
clusters satisfying the threshold in $\beta_N^{\rm (best)}$ and in N200 (94 per
cent), L200 (83 per cent), and Lbcg (26 per cent). We assess
the significance of the differences in the masses using
the 1000 bootstrap-resampled datasets. We find that the
combined tracers with exponents $\beta^{\rm (best)}$ yield higher masses
than N200, L200, or Lbcg in the majority of cases (> 50 per
cent), for the range of number densities we consider.

Redshift range     M_N^0          α_N            γ_N            r(M_N^0, α_N)   r(M_N^0, γ_N)   r(α_N, γ_N)
0.10 < z < 0.23    1.27 ± 0.08    1.20 ± 0.09    0.71 ± 0.14    -0.24           -0.40           0.03
0.23 < z < 0.30    1.57 ± 0.14    1.12 ± 0.15    0.34 ± 0.24    -0.07           -0.18           0.09

Redshift range     M_L^0          α_L            γ_L            r(M_L^0, α_L)   r(M_L^0, γ_L)   r(α_L, γ_L)
0.10 < z < 0.23    1.81 ± 0.15    1.27 ± 0.17    0.40 ± 0.23    -0.34           -0.17           0.34
0.23 < z < 0.30    1.76 ± 0.22    1.30 ± 0.29    0.26 ± 0.41    -0.42           -0.35           0.41

Table 2. Best-fitting parameters for the scaling of cluster mass with N200 and Lbcg (Eq. 16a), and with L200 and Lbcg (Eq. 16b).
The 1σ errors and correlation coefficients r in the table are derived from 1000 bootstrap-resampled datasets.

We emphasize that this result is relevant even if we
are not complete in Lbcg or L200, in the sense that this is
the estimate that minimizes the scatter among the clusters
we have. This does not imply that we could not have an
even better sample if we included clusters with N200 < 10
for which Lbcg is high. However, from Fig. 1 we see that
we are complete for $N_{200}\,L_{{\rm BCG},10} > 80$, so our results are
not affected by incompleteness for number densities below
$5 \times 10^{-6}\,(h^{-1}\,{\rm Mpc})^{-3}$.

Together with the results of Sec. 5.2.1, these findings
suggest that additional information from Lbcg provides im- 
proved determination of cluster masses, both in the mean 
and the scatter of the mass-observable relation. 



6 SUMMARY AND CONCLUSIONS 

We considered optical parameters that are available in large
samples of clusters of galaxies: cluster richness N200, cluster
luminosity L200, and the luminosity of the brightest cluster
galaxy Lbcg, as well as power law combinations of N200
with Lbcg, and L200 with Lbcg, to determine which is the
best mass tracer for clusters.

We calibrate the mean mass relation for these tracers
by measuring the stacked weak lensing signal around
clusters rank-ordered according to a given parameter. Our
best-fitting mass relations for N200 and L200 are given in
Eqs. (15a) and (15b). We then ask whether the weak lensing
signal changes significantly when a second parameter
is added to the first one. We can exploit any such residual
scaling to derive improved, lower-scatter mass tracers.



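We explore such tracers in the form $N_{200} L_{\rm BCG}^{\beta_N}$ and
$L_{200} L_{\rm BCG}^{\beta_L}$. The best-fitting mass relations are given in
Eqs. (16a) and (16b), with parameters given in Table 5. The
best mass tracer $M_{200\bar{\rho}}$ (in units of $10^{14}\,h^{-1}\,M_\odot$) we find is
(for the lower redshift sample):

$$
M_{14} = (1.27 \pm 0.08)
\left(\frac{N_{200}}{20}\right)^{1.20 \pm 0.09}
\left(\frac{L_{\rm BCG}}{\bar{L}_{\rm BCG}(N_{200})}\right)^{0.71 \pm 0.14},
$$

where $\bar{L}_{\rm BCG}(N_{200}) = 1.54\,N_{200}^{\,[\ldots]} \times 10^{10}\,h^{-2}\,L_\odot$ is the mean
BCG luminosity at a given richness.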
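
As an illustration, the best mass tracer quoted in the abstract can be evaluated with a few lines of Python. This is a minimal sketch using the central values only (1.27, 1.20, 0.71) and ignoring their uncertainties. The function name and the argument names `norm`, `alpha` and `gamma` are assumptions introduced here, and the mean BCG luminosity at the given richness must be supplied by the caller rather than computed from a fitted relation.

```python
def m200_from_richness_and_bcg(n200, l_bcg, l_bcg_mean,
                               norm=1.27, alpha=1.20, gamma=0.71):
    """Return the mass estimate in units of 1e14 h^-1 Msun.

    n200       : cluster richness N200
    l_bcg      : BCG luminosity of the cluster
    l_bcg_mean : mean BCG luminosity at this richness, in the same units
                 as l_bcg (caller-supplied; not computed here)
    norm, alpha, gamma : central values of the quoted normalization and
                 exponents (hypothetical argument names)
    """
    if n200 <= 0 or l_bcg <= 0 or l_bcg_mean <= 0:
        raise ValueError("richness and luminosities must be positive")
    return norm * (n200 / 20.0) ** alpha * (l_bcg / l_bcg_mean) ** gamma

# A cluster at the pivot richness with an average BCG returns the normalization.
pivot = m200_from_richness_and_bcg(20, 1.0, 1.0)
# The same richness with a BCG twice as luminous as the mean.
brighter = m200_from_richness_and_bcg(20, 2.0, 1.0)
print(pivot, brighter)
```

Only the ratio of the BCG luminosity to its mean at fixed richness enters, so any luminosity unit can be used as long as both arguments share it.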

Our results suggest that Lbcg is an important second
parameter in addition to N200 and L200. At fixed N200, residual
scaling with Lbcg is seen at the $\sim 5\sigma$ level in the lower
redshift sample (0.10 < z < 0.23), and at the $\sim 1.5\sigma$ level
in the higher redshift sample (0.23 < z < 0.30). The need
for a second parameter is less evident when L200 is used as
the primary variable instead of N200; we find that residual
scaling with Lbcg is preferred at the $\sim 2\sigma$ level in the lower
redshift sample, and find no evidence for residual scaling in
the higher redshift sample.

We assess the relative amount of scatter in the various 
mass-observable relations by measuring the stacked weak 
lensing signal around clusters satisfying thresholds in each 
parameter. For a given comoving number density of clusters, 
low-scatter mass tracers will select more of the most mas- 
sive clusters in the sample and thus yield a stronger lensing 
signal, compared to a large-scatter mass tracer. Among the 
parameters N200, L200, and Lbcg, cluster richness is the best 
mass tracer for clusters, while Lbcg is the poorest tracer. 
We find that a combined tracer of the form $N_{200} L_{\rm BCG}^{\beta_N}$
reduces the scatter in the mass relation compared to cluster 
richness taken alone, for the most massive clusters in the 
sample. 
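
The threshold argument above can be illustrated with a toy simulation. This is a sketch under stated assumptions, not a model of the maxBCG sample: the log-normal mass distribution, the scatter values 0.2 and 0.8, and the function name `mean_mass_above_threshold` are all invented for illustration, and the mean mass of the selected objects stands in for the stacked lensing signal.

```python
import numpy as np

def mean_mass_above_threshold(tracer, mass, n_select):
    """Mean mass of the n_select objects with the largest tracer values."""
    tracer = np.asarray(tracer, dtype=float)
    mass = np.asarray(mass, dtype=float)
    top = np.argsort(tracer)[::-1][:n_select]  # indices of the highest-ranked objects
    return float(np.mean(mass[top]))

rng = np.random.default_rng(1)
n_clusters = 50000
ln_mass = rng.normal(0.0, 1.0, n_clusters)  # toy ln-mass distribution (assumption)
mass = np.exp(ln_mass)

# Two toy tracers of ln(mass) with different intrinsic scatter.
tight = ln_mass + rng.normal(0.0, 0.2, n_clusters)
loose = ln_mass + rng.normal(0.0, 0.8, n_clusters)

n_select = 500  # fixed number of objects, i.e. fixed comoving number density
m_ideal = mean_mass_above_threshold(ln_mass, mass, n_select)  # perfect tracer
m_tight = mean_mass_above_threshold(tight, mass, n_select)
m_loose = mean_mass_above_threshold(loose, mass, n_select)
print(m_ideal, m_tight, m_loose)
```

At fixed number of selected objects, the tighter tracer recovers a mean mass closer to that of a perfect mass ranking, which is the sense in which a stronger stacked signal indicates lower scatter.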

From SDSS spectroscopy of clusters in the maxBCG
catalog, Becker et al. (2007) found residual scaling of velocity
dispersions with BCG luminosity Lbcg at fixed richness
N200. Our results consequently confirm that this residual
scaling also appears in the projected mass distributions.

Our results are consistent with the current picture of 
cluster formation from halo mergers. N-body simulations
and semi-analytic models find that at a fixed mass, dark
matter haloes which form earlier have brighter, redder central
sub-haloes (i.e., brighter, redder BCGs) and lower richness
(Wechsler et al. 2006; Croton et al. 2007). This may result
from the satellites having had more time to merge onto
the BCG, lowering the richness from when the cluster formed 
while enhancing the BCG luminosity. This implies that N200 
and Lbcg are anti-correlated at fixed mass, and provides an 
explanation for our result above, i.e., that a combination 
of these two observables yields a tighter relation with mass 
than either of them taken alone. 
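
The argument in the preceding paragraph can be made explicit with a simple toy model. The following is a sketch under assumptions that are not taken from the analysis above: log-normal scatter, power-law mean relations with hypothetical slopes $a$ and $b$, a correlation coefficient $r$ between the two residuals at fixed mass, and a first-order inversion that ignores the slope of the mass function.

```latex
% Toy model (assumed): at fixed mass M,
%   ln N200  = a ln M + eps_N,   ln L_BCG = b ln M + eps_L,   a, b > 0,
% with zero-mean Gaussian residuals of dispersion sigma_N and sigma_L
% and correlation coefficient r.
\begin{align}
  T(\beta) &\equiv \ln N_{200} + \beta \ln L_{\rm BCG}
            = (a + \beta b)\,\ln M + \epsilon_N + \beta\,\epsilon_L , \\
  \sigma_{\ln M \mid T}(\beta) &\simeq
     \frac{\sqrt{\sigma_N^2 + 2\beta r\,\sigma_N \sigma_L + \beta^2 \sigma_L^2}}
          {a + \beta b} , \\
  \left.\frac{{\rm d}\sigma_{\ln M \mid T}}{{\rm d}\beta}\right|_{\beta = 0}
     &= \frac{\sigma_N}{a}\left( r\,\frac{\sigma_L}{\sigma_N} - \frac{b}{a} \right) .
\end{align}
```

In this toy model the derivative at $\beta = 0$ is negative whenever $r\,\sigma_L/\sigma_N < b/a$, so a small positive exponent on the BCG luminosity always tightens the relation when $r \le 0$, and an anti-correlation ($r < 0$) makes the improvement larger.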

The weaker residual scaling with Lbcg when using L200
instead of N200 suggests that the anti-correlation between
L200 and Lbcg at fixed mass is much weaker; this is also 
consistent with the above scenario, since the luminosity of 
the BCG is included in the cluster luminosity. Moreover, this
result constrains the amount of light that has been lost to 
the intra-cluster medium due to the merging of red satellite 
galaxies with the BCG since the formation of the cluster. 
If this were a significant fraction of the cluster luminosity in
red galaxies, L200 would be lower for earlier-forming clusters,
and therefore anti-correlated with Lbcg. We do not detect
such an effect, so our results are consistent with a scenario 
where the cluster luminosity in red galaxies remains approx- 
imately constant over time. 

Independent of the underlying astrophysical mecha- 
nisms, the improved mass tracers we found can be used to 
obtain accurate mass estimates and define mass thresholds 
in cluster samples with optical data. These in turn can be 
used to provide more precise constraints on cosmological 
parameters, such as the amplitude of mass fluctuations $\sigma_8$,
which will be the subject of future work. 